Overview

Dataset statistics

Number of variables: 10
Number of observations: 14609
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 32
Duplicate rows (%): 0.2%
Total size in memory: 1.1 MiB
Average record size in memory: 80.0 B
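
The size and duplicate figures above are consistent with one another, which a short check can illustrate. This is only a sketch: it assumes every value is stored as an 8-byte float and ignores index overhead, neither of which is stated in the report.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 10
n_observations = 14609
duplicate_rows = 32

# Assumption (not stated in the report): each value is an 8-byte float
# and index overhead is ignored.
BYTES_PER_VALUE = 8

record_size_bytes = n_variables * BYTES_PER_VALUE
total_mib = record_size_bytes * n_observations / 2**20
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"record size: {record_size_bytes} B")
print(f"total size: {total_mib:.1f} MiB")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```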

Variable types

Numeric: 10

Alerts

Dataset has 32 (0.2%) duplicate rows [Duplicates]
Fwd Pkt Len Mean is highly overall correlated with Flow Duration and 4 other fields [High correlation]
Fwd IAT Mean is highly overall correlated with Pkt Len Var and 2 other fields [High correlation]
Pkt Len Mean is highly overall correlated with Flow Duration and 4 other fields [High correlation]
Pkt Len Var is highly overall correlated with Fwd IAT Mean [High correlation]
Pkt Size Avg is highly overall correlated with Flow Duration and 6 other fields [High correlation]
Fwd Seg Size Avg is highly overall correlated with Flow Duration and 4 other fields [High correlation]
Bwd Seg Size Avg is highly overall correlated with Flow Duration and 4 other fields [High correlation]
Active Mean is highly overall correlated with Pkt Len Var [High correlation]
Idle Mean is highly overall correlated with Flow Duration and 2 other fields [High correlation]
Flow Duration is highly overall correlated with Fwd Pkt Len Mean and 5 other fields [High correlation]
Pkt Len Var has 218 (1.5%) zeros [Zeros]
Active Mean has 699 (4.8%) zeros [Zeros]
Idle Mean has 599 (4.1%) zeros [Zeros]

Reproduction

Analysis started: 2022-12-12 12:36:21.531731
Analysis finished: 2022-12-12 12:36:37.122526
Duration: 15.59 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json

Variables

Flow Duration
Real number (ℝ)

Distinct: 8346
Distinct (%): 57.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0176367 × 10⁸
Minimum: 172154.14
Maximum: 1.2 × 10⁸
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 114.3 KiB

Quantile statistics

Minimum: 172154.14
5-th percentile: 8810872.7
Q1: 1.0928317 × 10⁸
Median: 1.1370661 × 10⁸
Q3: 1.1573552 × 10⁸
95-th percentile: 1.1828162 × 10⁸
Maximum: 1.2 × 10⁸
Range: 1.1982785 × 10⁸
Interquartile range (IQR): 6452352.1
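
A minimal sketch of how quantile statistics like these can be computed, assuming linear interpolation between order statistics. The report does not state which interpolation method was used, and the function name and toy data below are illustrative only.

```python
def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation."""
    data = sorted(values)
    if not data:
        raise ValueError("values must not be empty")
    # Fractional position of the requested percentile in the sorted data.
    pos = (len(data) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    frac = pos - lower
    return data[lower] + (data[upper] - data[lower]) * frac


toy = [2, 4, 4, 5, 7, 9, 10, 12]  # hypothetical data
q1 = percentile(toy, 25)
median = percentile(toy, 50)
q3 = percentile(toy, 75)
iqr = q3 - q1  # interquartile range
value_range = percentile(toy, 100) - percentile(toy, 0)  # max - min
```

The IQR and range rows follow directly from the other rows: IQR is Q3 minus Q1, and range is maximum minus minimum.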

Descriptive statistics

Standard deviation: 32759923
Coefficient of variation (CV): 0.3219216
Kurtosis: 4.1111453
Mean: 1.0176367 × 10⁸
Median Absolute Deviation (MAD): 2189885.6
Skewness: -2.4393706
Sum: 1.4866654 × 10¹²
Variance: 1.0732126 × 10¹⁵
Monotonicity: Not monotonic
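
Several of the descriptive statistics above are functions of one another, so the reported values can be cross-checked. The sketch below plugs in the rounded Flow Duration figures from this report, so agreement is only approximate. It checks that CV is the standard deviation divided by the mean, that variance is the squared standard deviation, and that the sum is the mean times the number of observations.

```python
import math

# Rounded figures copied from the Flow Duration statistics in this report.
n = 14609
mean = 1.0176367e8
std = 32759923
reported_cv = 0.3219216
reported_variance = 1.0732126e15
reported_sum = 1.4866654e12

cv = std / mean
variance = std ** 2
total = mean * n

# The inputs are rounded, so compare with a relative tolerance.
for name, got, want in [
    ("CV", cv, reported_cv),
    ("variance", variance, reported_variance),
    ("sum", total, reported_sum),
]:
    ok = math.isclose(got, want, rel_tol=1e-6)
    print(f"{name}: computed {got:.7g}, reported {want:.7g}, consistent={ok}")
```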
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
8810872.65 | 661 | 4.5%
113706608.1 | 638 | 4.4%
111753381.8 | 637 | 4.4%
106772664.3 | 637 | 4.4%
115860880.9 | 615 | 4.2%
115232186.8 | 614 | 4.2%
105644629.1 | 446 | 3.1%
114162049.9 | 373 | 2.6%
11557810.88 | 373 | 2.6%
116887717.4 | 359 | 2.5%
Other values (8336) | 9256 | 63.4%

Value | Count | Frequency (%)
172154.1429 | 1 | < 0.1%
191767.1111 | 1 | < 0.1%
194978.125 | 1 | < 0.1%
195003.25 | 1 | < 0.1%
198201.75 | 1 | < 0.1%
198345 | 1 | < 0.1%
199043.0909 | 1 | < 0.1%
201672.7333 | 1 | < 0.1%
202731.4615 | 1 | < 0.1%
203114.4545 | 1 | < 0.1%

Value | Count | Frequency (%)
120000000 | 1 | < 0.1%
119997450 | 1 | < 0.1%
119995846 | 1 | < 0.1%
119991605 | 1 | < 0.1%
119991198 | 1 | < 0.1%
119989397 | 1 | < 0.1%
119986342 | 1 | < 0.1%
119984575 | 1 | < 0.1%
119979525.5 | 1 | < 0.1%
119977985.8 | 1 | < 0.1%

Fwd Pkt Len Mean
Real number (ℝ)

Distinct: 8315
Distinct (%): 56.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 293.87977
Minimum: 0
Maximum: 1453.6481
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 114.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 66.925511
Q1: 207.61419
Median: 274.00105
Q3: 366.62028
95-th percentile: 606.59422
Maximum: 1453.6481
Range: 1453.6481
Interquartile range (IQR): 159.00609

Descriptive statistics

Standard deviation: 148.45261
Coefficient of variation (CV): 0.50514744
Kurtosis: 1.4505709
Mean: 293.87977
Median Absolute Deviation (MAD): 78.895791
Skewness: 0.6712408
Sum: 4293289.6
Variance: 22038.179
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
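
A minimal sketch of fixed-width binning as named in the caption above, using equal-width bins between the minimum and maximum. The function name and sample values are illustrative, and how the profiling library treats edge cases is not shown in this report.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # Degenerate case: all values identical, so use a single bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum value is assigned to the last bin (right edge inclusive).
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


sample = [0, 12, 12, 207.6, 274.0, 366.6, 606.6, 1453.6]  # hypothetical values
counts = fixed_width_histogram(sample, bins=50)
```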
Value | Count | Frequency (%)
509.516994 | 661 | 4.5%
236.6956492 | 638 | 4.4%
84.69290836 | 637 | 4.4%
66.92551068 | 637 | 4.4%
606.5942162 | 615 | 4.2%
237.9919745 | 614 | 4.2%
435.628899 | 446 | 3.1%
318.2903519 | 373 | 2.6%
352.8968374 | 373 | 2.6%
207.6141922 | 359 | 2.5%
Other values (8305) | 9256 | 63.4%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
12 | 2 | < 0.1%
13.06593407 | 1 | < 0.1%
21.88783069 | 1 | < 0.1%
22.88549619 | 1 | < 0.1%
28.77553422 | 1 | < 0.1%
29.89917899 | 1 | < 0.1%
30.92392638 | 1 | < 0.1%
32.93133803 | 1 | < 0.1%
32.95178572 | 1 | < 0.1%

Value | Count | Frequency (%)
1453.648116 | 1 | < 0.1%
1453.282297 | 1 | < 0.1%
1451.847875 | 1 | < 0.1%
1448.992526 | 1 | < 0.1%
1447.717813 | 1 | < 0.1%
1446.958932 | 1 | < 0.1%
1094.736141 | 1 | < 0.1%
994.3990847 | 1 | < 0.1%
989.350973 | 1 | < 0.1%
985.3638437 | 1 | < 0.1%

Fwd IAT Mean
Real number (ℝ)

Distinct: 8306
Distinct (%): 56.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3163251.8
Minimum: 0
Maximum: 18700000
Zeros: 33
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 114.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 484820.57
Q1: 1554344.6
Median: 3174253.3
Q3: 4252937.5
95-th percentile: 6420068.6
Maximum: 18700000
Range: 18700000
Interquartile range (IQR): 2698592.9

Descriptive statistics

Standard deviation: 1856436.2
Coefficient of variation (CV): 0.58687589
Kurtosis: 2.711012
Mean: 3163251.8
Median Absolute Deviation (MAD): 1229664.7
Skewness: 0.79870426
Sum: 4.6211946 × 10¹⁰
Variance: 3.4463555 × 10¹²
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
484820.5699 | 661 | 4.5%
1554344.596 | 638 | 4.4%
4628602.588 | 637 | 4.4%
2835132.453 | 637 | 4.4%
1247332.517 | 615 | 4.2%
4001714.419 | 614 | 4.2%
7304310.57 | 446 | 3.1%
5323309.353 | 373 | 2.6%
71865.17121 | 373 | 2.6%
2164658.695 | 359 | 2.5%
Other values (8296) | 9256 | 63.4%

Value | Count | Frequency (%)
0 | 33 | 0.2%
71 | 1 | < 0.1%
248.8726708 | 1 | < 0.1%
453.2105263 | 1 | < 0.1%
747.3846154 | 1 | < 0.1%
952.3636364 | 1 | < 0.1%
1588.642857 | 1 | < 0.1%
1755.796297 | 1 | < 0.1%
4929.447368 | 1 | < 0.1%
5657.392225 | 1 | < 0.1%

Value | Count | Frequency (%)
18700000 | 1 | < 0.1%
18000000 | 1 | < 0.1%
17800000 | 1 | < 0.1%
16200000 | 1 | < 0.1%
15000000 | 9 | 0.1%
14200000 | 1 | < 0.1%
13333333.33 | 1 | < 0.1%
13333258.73 | 1 | < 0.1%
13251048.25 | 1 | < 0.1%
13093161.1 | 1 | < 0.1%

Pkt Len Mean
Real number (ℝ)

Distinct: 8315
Distinct (%): 56.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 292.90475
Minimum: 0
Maximum: 1453.659
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 114.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 70.741203
Q1: 207.61226
Median: 272.89602
Q3: 365.58279
95-th percentile: 606.62705
Maximum: 1453.659
Range: 1453.659
Interquartile range (IQR): 157.97053

Descriptive statistics

Standard deviation: 148.31746
Coefficient of variation (CV): 0.50636755
Kurtosis: 1.4723088
Mean: 292.90475
Median Absolute Deviation (MAD): 80.344308
Skewness: 0.69013286
Sum: 4279045.5
Variance: 21998.069
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
509.5214262 | 661 | 4.5%
236.6341502 | 638 | 4.4%
84.81351348 | 637 | 4.4%
70.74120257 | 637 | 4.4%
606.6270486 | 615 | 4.2%
232.4193871 | 614 | 4.2%
429.0506408 | 446 | 3.1%
315.4352585 | 373 | 2.6%
353.2403274 | 373 | 2.6%
207.6122616 | 359 | 2.5%
Other values (8305) | 9256 | 63.4%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
12 | 2 | < 0.1%
13.17142857 | 1 | < 0.1%
21.87695516 | 1 | < 0.1%
22.85714286 | 1 | < 0.1%
28.79294723 | 1 | < 0.1%
29.90596072 | 1 | < 0.1%
30.94002448 | 1 | < 0.1%
32.9493007 | 1 | < 0.1%
32.96631206 | 1 | < 0.1%

Value | Count | Frequency (%)
1453.658974 | 1 | < 0.1%
1453.298329 | 1 | < 0.1%
1451.884187 | 1 | < 0.1%
1449.025335 | 1 | < 0.1%
1447.760984 | 1 | < 0.1%
1447.01227 | 1 | < 0.1%
1094.74337 | 1 | < 0.1%
994.1877578 | 1 | < 0.1%
989.3419161 | 1 | < 0.1%
985.4146134 | 1 | < 0.1%

Pkt Len Var
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 8243
Distinct (%): 56.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2765.409
Minimum: 0
Maximum: 57366.062
Zeros: 218
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 114.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 81.800395
Q1: 1159.7846
Median: 2831.2342
Q3: 3597.5717
95-th percentile: 6310.6169
Maximum: 57366.062
Range: 57366.062
Interquartile range (IQR): 2437.7871

Descriptive statistics

Standard deviation: 2424.6645
Coefficient of variation (CV): 0.8767833
Kurtosis: 63.841706
Mean: 2765.409
Median Absolute Deviation (MAD): 1363.1487
Skewness: 4.4920839
Sum: 40399860
Variance: 5878997.7
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
190.1616446 | 661 | 4.5%
1187.088024 | 638 | 4.4%
3579.009639 | 637 | 4.4%
357.670288 | 637 | 4.4%
2990.705458 | 615 | 4.2%
6012.953024 | 614 | 4.2%
7903.578068 | 446 | 3.1%
3352.507692 | 373 | 2.6%
81.80039475 | 373 | 2.6%
1721.908717 | 359 | 2.5%
Other values (8233) | 9256 | 63.4%

Value | Count | Frequency (%)
0 | 218 | 1.5%
0.008297586 | 1 | < 0.1%
0.016666667 | 1 | < 0.1%
0.023581067 | 1 | < 0.1%
0.048076923 | 1 | < 0.1%
0.052067697 | 1 | < 0.1%
0.063186813 | 1 | < 0.1%
0.080132451 | 1 | < 0.1%
0.082417582 | 1 | < 0.1%
0.083333333 | 1 | < 0.1%

Value | Count | Frequency (%)
57366.06247 | 1 | < 0.1%
46694.44444 | 2 | < 0.1%
45395.73773 | 1 | < 0.1%
45026.78571 | 2 | < 0.1%
38564.85955 | 1 | < 0.1%
31498.16113 | 1 | < 0.1%
27645.23103 | 1 | < 0.1%
27634.05654 | 1 | < 0.1%
27230.98488 | 1 | < 0.1%
26781.56183 | 1 | < 0.1%

Pkt Size Avg
Real number (ℝ)

Distinct: 8322
Distinct (%): 57.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 322.14837
Minimum: 0
Maximum: 1455.125
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 114.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 74.51799
Q1: 211.76099
Median: 279.26893
Q3: 399.75337
95-th percentile: 741.88074
Maximum: 1455.125
Range: 1455.125
Interquartile range (IQR): 187.99239

Descriptive statistics

Standard deviation: 189.56092
Coefficient of variation (CV): 0.58842737
Kurtosis: 0.84974786
Mean: 322.14837
Median Absolute Deviation (MAD): 81.35427
Skewness: 1.0113427
Sum: 4706265.6
Variance: 35933.343
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
741.8807375 | 661 | 4.5%
237.8101539 | 638 | 4.4%
86.81375316 | 637 | 4.4%
74.51798978 | 637 | 4.4%
608.358666 | 615 | 4.2%
236.4514086 | 614 | 4.2%
435.6487973 | 446 | 3.1%
524.0790918 | 373 | 2.6%
328.4370947 | 373 | 2.6%
208.8986524 | 359 | 2.5%
Other values (8312) | 9256 | 63.4%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
12.1 | 1 | < 0.1%
12.10084034 | 1 | < 0.1%
14.30498866 | 1 | < 0.1%
23.03030303 | 1 | < 0.1%
23.26830732 | 1 | < 0.1%
30.12226602 | 1 | < 0.1%
30.64673227 | 1 | < 0.1%
30.97794118 | 1 | < 0.1%
33.06491228 | 1 | < 0.1%

Value | Count | Frequency (%)
1455.125 | 1 | < 0.1%
1455.034648 | 1 | < 0.1%
1454.902481 | 1 | < 0.1%
1451.18806 | 1 | < 0.1%
1450.309859 | 1 | < 0.1%
1449.977459 | 1 | < 0.1%
1141.534762 | 1 | < 0.1%
1081.8 | 1 | < 0.1%
1037.773477 | 1 | < 0.1%
1037.500333 | 1 | < 0.1%

Fwd Seg Size Avg
Real number (ℝ)

Distinct: 8315
Distinct (%): 56.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 293.87977
Minimum: 0
Maximum: 1453.6481
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 114.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 66.925511
Q1: 207.61419
Median: 274.00105
Q3: 366.62028
95-th percentile: 606.59422
Maximum: 1453.6481
Range: 1453.6481
Interquartile range (IQR): 159.00609

Descriptive statistics

Standard deviation: 148.45261
Coefficient of variation (CV): 0.50514744
Kurtosis: 1.4505709
Mean: 293.87977
Median Absolute Deviation (MAD): 78.895791
Skewness: 0.6712408
Sum: 4293289.6
Variance: 22038.179
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
509.516994 | 661 | 4.5%
236.6956492 | 638 | 4.4%
84.69290836 | 637 | 4.4%
66.92551068 | 637 | 4.4%
606.5942162 | 615 | 4.2%
237.9919745 | 614 | 4.2%
435.628899 | 446 | 3.1%
318.2903519 | 373 | 2.6%
352.8968374 | 373 | 2.6%
207.6141922 | 359 | 2.5%
Other values (8305) | 9256 | 63.4%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
12 | 2 | < 0.1%
13.06593407 | 1 | < 0.1%
21.88783069 | 1 | < 0.1%
22.88549619 | 1 | < 0.1%
28.77553422 | 1 | < 0.1%
29.89917899 | 1 | < 0.1%
30.92392638 | 1 | < 0.1%
32.93133803 | 1 | < 0.1%
32.95178572 | 1 | < 0.1%

Value | Count | Frequency (%)
1453.648116 | 1 | < 0.1%
1453.282297 | 1 | < 0.1%
1451.847875 | 1 | < 0.1%
1448.992526 | 1 | < 0.1%
1447.717813 | 1 | < 0.1%
1446.958932 | 1 | < 0.1%
1094.736141 | 1 | < 0.1%
994.3990847 | 1 | < 0.1%
989.350973 | 1 | < 0.1%
985.3638437 | 1 | < 0.1%

Bwd Seg Size Avg
Real number (ℝ)

Distinct: 7111
Distinct (%): 48.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 293.13272
Minimum: 0
Maximum: 1460
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 114.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 89.945455
Q1: 210.16667
Median: 273.69231
Q3: 367.57895
95-th percentile: 610.5
Maximum: 1460
Range: 1460
Interquartile range (IQR): 157.41228

Descriptive statistics

Standard deviation: 147.3605
Coefficient of variation (CV): 0.50270916
Kurtosis: 1.6346008
Mean: 293.13272
Median Absolute Deviation (MAD): 77.307692
Skewness: 0.76221544
Sum: 4282375.9
Variance: 21715.118
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
510 | 662 | 4.5%
239.4705882 | 638 | 4.4%
90.28571429 | 637 | 4.4%
93.22222222 | 637 | 4.4%
610.5 | 615 | 4.2%
217.3333333 | 614 | 4.2%
407.625 | 446 | 3.1%
307.4444444 | 373 | 2.6%
351 | 373 | 2.6%
213.7272727 | 359 | 2.5%
Other values (7101) | 9255 | 63.4%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
12 | 2 | < 0.1%
13.85714286 | 1 | < 0.1%
21 | 1 | < 0.1%
21.14285714 | 1 | < 0.1%
22 | 4 | < 0.1%
24.54545455 | 1 | < 0.1%
27.33333333 | 1 | < 0.1%
30.28571429 | 1 | < 0.1%
32 | 1 | < 0.1%

Value | Count | Frequency (%)
1460 | 6 | < 0.1%
1098 | 1 | < 0.1%
998.3333333 | 1 | < 0.1%
996.3333333 | 1 | < 0.1%
990.6666667 | 1 | < 0.1%
977.3333333 | 4 | < 0.1%
890 | 1 | < 0.1%
886.8 | 2 | < 0.1%
855.4285714 | 1 | < 0.1%
817.5 | 1 | < 0.1%

Active Mean
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 8053
Distinct (%): 55.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 615272.1
Minimum: 0
Maximum: 24550000
Zeros: 699
Zeros (%): 4.8%
Negative: 0
Negative (%): 0.0%
Memory size: 114.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 991.43647
Q1: 77919.63
Median: 239639.84
Q3: 576768.97
95-th percentile: 3058145.8
Maximum: 24550000
Range: 24550000
Interquartile range (IQR): 498849.34

Descriptive statistics

Standard deviation: 1278281.3
Coefficient of variation (CV): 2.077587
Kurtosis: 52.241469
Mean: 615272.1
Median Absolute Deviation (MAD): 184845.98
Skewness: 5.60469
Sum: 8.9885101 × 10⁹
Variance: 1.6340031 × 10¹²
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 699 | 4.8%
48424.725 | 661 | 4.5%
124190.6005 | 638 | 4.4%
77919.62963 | 637 | 4.4%
371114.7381 | 637 | 4.4%
225047.4167 | 615 | 4.2%
626362.2639 | 614 | 4.2%
622491.6198 | 446 | 3.1%
767778.4931 | 373 | 2.6%
26360.94318 | 359 | 2.5%
Other values (8043) | 8930 | 61.1%

Value | Count | Frequency (%)
0 | 699 | 4.8%
5.411764706 | 1 | < 0.1%
10.1 | 1 | < 0.1%
12.57142857 | 1 | < 0.1%
20.84 | 1 | < 0.1%
23.81818182 | 1 | < 0.1%
28.83333333 | 1 | < 0.1%
35.71428571 | 1 | < 0.1%
36.07692308 | 1 | < 0.1%
37.33333333 | 1 | < 0.1%

Value | Count | Frequency (%)
24550000 | 1 | < 0.1%
22985558.5 | 1 | < 0.1%
22825000 | 1 | < 0.1%
21634217 | 1 | < 0.1%
18307280.75 | 1 | < 0.1%
17250000 | 1 | < 0.1%
16995773.94 | 1 | < 0.1%
15964244.73 | 1 | < 0.1%
15819461.75 | 1 | < 0.1%
15350000 | 1 | < 0.1%

Idle Mean
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 7254
Distinct (%): 49.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10422189
Minimum: 0
Maximum: 90000000
Zeros: 599
Zeros (%): 4.1%
Negative: 0
Negative (%): 0.0%
Memory size: 114.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1471855.5
Q1: 7673684.2
Median: 10804545
Q3: 13255556
95-th percentile: 18733697
Maximum: 90000000
Range: 90000000
Interquartile range (IQR): 5581871.3

Descriptive statistics

Standard deviation: 5863050.6
Coefficient of variation (CV): 0.56255461
Kurtosis: 7.0577517
Mean: 10422189
Median Absolute Deviation (MAD): 2586897.9
Skewness: 1.4552386
Sum: 1.5225776 × 10¹¹
Variance: 3.4375363 × 10¹³
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2429015.506 | 661 | 4.5%
10958811.67 | 638 | 4.4%
13255555.56 | 638 | 4.4%
11442857.08 | 637 | 4.4%
9137500 | 616 | 4.2%
13588888.89 | 614 | 4.2%
0 | 599 | 4.1%
30562500 | 446 | 3.1%
13522219.51 | 373 | 2.6%
10881818.18 | 359 | 2.5%
Other values (7244) | 9028 | 61.8%

Value | Count | Frequency (%)
0 | 599 | 4.1%
329924.8333 | 1 | < 0.1%
450000 | 1 | < 0.1%
456250 | 1 | < 0.1%
480952.381 | 1 | < 0.1%
513994.8553 | 1 | < 0.1%
541990.3333 | 1 | < 0.1%
600000 | 1 | < 0.1%
610361.4531 | 1 | < 0.1%
617391.3043 | 1 | < 0.1%

Value | Count | Frequency (%)
90000000 | 1 | < 0.1%
72000000 | 1 | < 0.1%
59950000 | 1 | < 0.1%
49633333.33 | 1 | < 0.1%
45000000 | 1 | < 0.1%
44133333.33 | 1 | < 0.1%
42466592.18 | 1 | < 0.1%
40000000 | 1 | < 0.1%
39733333.33 | 1 | < 0.1%
39566666.67 | 1 | < 0.1%

Interactions

[Figure: pairwise interaction plots for the 10 variables (100 panels)]

Correlations


Auto

The auto setting picks an interpretable pairwise metric for each pair of columns, according to the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramer's V, [0,1]
  • Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used in the discretization for the Numerical-Categorical column pair can be changed using config.correlations["auto"].n_bins. The number of bins affects the granularity of the association you wish to measure.

This configuration uses the recommended metric for each pair of columns.
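
To illustrate the Numerical-Categorical case, here is a minimal sketch that discretizes a numeric column into `n_bins` bins and then computes Cramér's V against a categorical column. It assumes equal-width bins and the uncorrected form of Cramér's V. Whether the profiling library bins this way or applies a bias correction is not stated in this report, and the helper names and toy columns are hypothetical.

```python
from collections import Counter
from math import sqrt


def discretize(values, n_bins):
    """Map each numeric value to an equal-width bin index in 0..n_bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def cramers_v(a, b):
    """Uncorrected Cramér's V between two categorical sequences, in [0, 1]."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    chi2 = 0.0
    for ka, na in count_a.items():
        for kb, nb in count_b.items():
            expected = na * nb / n
            observed = joint.get((ka, kb), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(count_a), len(count_b)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


numeric = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]  # hypothetical numeric column
labels = ["low", "low", "low", "high", "high", "high"]  # hypothetical categorical column
v = cramers_v(discretize(numeric, n_bins=2), labels)
```

With more bins the discretized column distinguishes finer ranges of the numeric variable, which is the granularity trade-off the paragraph above refers to.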

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
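
The calculation described above can be sketched in a few lines: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their ranks is an assumption here (it is the common convention), and the function names and toy data are illustrative.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: monotonic but not linear
rho = spearman(x, y)
r = pearson(x, y)
```

On this toy data ρ is 1 up to floating-point error, because the relationship is perfectly monotonic, while Pearson's r is lower because the relationship is not linear.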

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
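
A minimal sketch of this formula, together with a check of the location and scale invariance mentioned above. It uses population covariance and population standard deviation consistently (the normalization cancels in the ratio). The function name and toy data are illustrative.

```python
from statistics import mean, pstdev


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (pstdev(x) * pstdev(y))


x = [1.0, 2.0, 4.0, 7.0]  # hypothetical data
y = [2.0, 3.0, 7.0, 8.0]

r = pearson_r(x, y)
# Shift x by 10 and scale it by a positive factor of 3: r is unchanged.
r_rescaled = pearson_r([3 * a + 10 for a in x], y)
# An exactly linear relationship gives r = 1 regardless of the slope.
r_linear = pearson_r(x, [2 * a + 1 for a in x])
```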

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
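
The pair-counting definition above corresponds to the variant usually called tau-a, sketched below. Statistical libraries often use a tie-corrected variant (tau-b) instead, so results on data with ties may differ from this sketch. The function name and toy data are illustrative.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total pairs, i.e. tau-a with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1  # both variables order the pair the same way
        elif s < 0:
            discordant += 1  # the variables order the pair in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


tau_same = kendall_tau([1, 2, 3, 4], [1, 2, 3, 4])
tau_reversed = kendall_tau([1, 2, 3, 4], [4, 3, 2, 1])
tau_one_swap = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])  # 5 concordant, 1 discordant
```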

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | Flow Duration | Fwd Pkt Len Mean | Fwd IAT Mean | Pkt Len Mean | Pkt Len Var | Pkt Size Avg | Fwd Seg Size Avg | Bwd Seg Size Avg | Active Mean | Idle Mean
0 | 11557810.88 | 352.896837 | 71865.17121 | 353.240327 | 81.800395 | 524.079092 | 352.896837 | 351.0 | 0.0 | 0.0
1 | 11557810.88 | 352.896837 | 71865.17121 | 353.240327 | 81.800395 | 524.079092 | 352.896837 | 351.0 | 0.0 | 0.0
2 | 11557810.88 | 352.896837 | 71865.17121 | 353.240327 | 81.800395 | 524.079092 | 352.896837 | 351.0 | 0.0 | 0.0
3 | 11557810.88 | 352.896837 | 71865.17121 | 353.240327 | 81.800395 | 524.079092 | 352.896837 | 351.0 | 0.0 | 0.0
4 | 11557810.88 | 352.896837 | 71865.17121 | 353.240327 | 81.800395 | 524.079092 | 352.896837 | 351.0 | 0.0 | 0.0
5 | 11557810.88 | 352.896837 | 71865.17121 | 353.240327 | 81.800395 | 524.079092 | 352.896837 | 351.0 | 0.0 | 0.0
6 | 11557810.88 | 352.896837 | 71865.17121 | 353.240327 | 81.800395 | 524.079092 | 352.896837 | 351.0 | 0.0 | 0.0
7 | 11557810.88 | 352.896837 | 71865.17121 | 353.240327 | 81.800395 | 524.079092 | 352.896837 | 351.0 | 0.0 | 0.0
8 | 11557810.88 | 352.896837 | 71865.17121 | 353.240327 | 81.800395 | 524.079092 | 352.896837 | 351.0 | 0.0 | 0.0
9 | 11557810.88 | 352.896837 | 71865.17121 | 353.240327 | 81.800395 | 524.079092 | 352.896837 | 351.0 | 0.0 | 0.0

Index | Flow Duration | Fwd Pkt Len Mean | Fwd IAT Mean | Pkt Len Mean | Pkt Len Var | Pkt Size Avg | Fwd Seg Size Avg | Bwd Seg Size Avg | Active Mean | Idle Mean
14599 | 2.892786e+05 | 420.125000 | 0.000000 | 420.125000 | 0.000000 | 630.187500 | 420.125000 | 420.125000 | 0.00000 | 0.000
14600 | 6.034903e+06 | 638.155891 | 6307.749163 | 638.156757 | 252.213326 | 927.056103 | 638.155891 | 638.458333 | 19958.20833 | 2050000.000
14601 | 2.599031e+05 | 516.555556 | 0.000000 | 516.555556 | 0.000000 | 774.833333 | 516.555556 | 516.555556 | 0.00000 | 0.000
14602 | 3.636731e+06 | 613.078743 | 643641.331200 | 613.082986 | 23.394605 | 908.071020 | 613.078743 | 613.333333 | 0.00000 | 625000.000
14603 | 1.917671e+05 | 512.888889 | 1755.796297 | 512.888889 | 0.000000 | 769.333333 | 512.888889 | 512.888889 | 0.00000 | 0.000
14604 | 2.212897e+07 | 477.334387 | 713692.217300 | 475.273352 | 1988.924082 | 699.971666 | 477.334387 | 469.040000 | 319135.18000 | 1449650.650
14605 | 6.624616e+06 | 546.666667 | 229044.261800 | 546.666667 | 0.000000 | 820.000000 | 546.666667 | 546.666667 | 0.00000 | 0.000
14606 | 4.334327e+06 | 497.346154 | 275173.718000 | 496.961539 | 2.884615 | 744.079487 | 497.346154 | 496.769231 | 15049.52885 | 2534549.375
14607 | 6.107274e+06 | 472.000000 | 488122.559600 | 472.000000 | 0.000000 | 708.000000 | 472.000000 | 472.000000 | 71239.64286 | 1387979.893
14608 | 7.698330e+06 | 527.006937 | 994409.903000 | 527.027483 | 435.623800 | 759.971744 | 527.006937 | 527.884615 | 33627.97115 | 1296990.404

Duplicate rows

Most frequently occurring

Index | Flow Duration | Fwd Pkt Len Mean | Fwd IAT Mean | Pkt Len Mean | Pkt Len Var | Pkt Size Avg | Fwd Seg Size Avg | Bwd Seg Size Avg | Active Mean | Idle Mean | # duplicates
2 | 8.810873e+06 | 509.516994 | 4.848206e+05 | 509.521426 | 190.161645 | 741.880738 | 509.516994 | 510.000000 | 48424.72500 | 2.429016e+06 | 661
14 | 1.137066e+08 | 236.695649 | 1.554345e+06 | 236.634150 | 1187.088024 | 237.810154 | 236.695649 | 239.470588 | 124190.60050 | 1.095881e+07 | 638
8 | 1.067727e+08 | 84.692908 | 2.835132e+06 | 84.813513 | 357.670288 | 86.813753 | 84.692908 | 93.222222 | 77919.62963 | 1.325556e+07 | 637
11 | 1.117534e+08 | 66.925511 | 4.628603e+06 | 70.741203 | 3579.009639 | 74.517990 | 66.925511 | 90.285714 | 371114.73810 | 1.144286e+07 | 637
26 | 1.158609e+08 | 606.594216 | 1.247333e+06 | 606.627049 | 2990.705458 | 608.358666 | 606.594216 | 610.500000 | 225047.41670 | 9.137500e+06 | 615
19 | 1.152322e+08 | 237.991974 | 4.001714e+06 | 232.419387 | 6012.953024 | 236.451409 | 237.991974 | 217.333333 | 626362.26390 | 1.358889e+07 | 614
7 | 1.056446e+08 | 435.628899 | 7.304311e+06 | 429.050641 | 7903.578068 | 435.648797 | 435.628899 | 407.625000 | 622491.61980 | 3.056250e+07 | 446
3 | 1.155781e+07 | 352.896837 | 7.186517e+04 | 353.240327 | 81.800395 | 524.079092 | 352.896837 | 351.000000 | 0.00000 | 0.000000e+00 | 373
15 | 1.141620e+08 | 318.290352 | 5.323309e+06 | 315.435258 | 3352.507692 | 328.437095 | 318.290352 | 307.444444 | 767778.49310 | 1.352222e+07 | 373
29 | 1.168877e+08 | 207.614192 | 2.164659e+06 | 207.612262 | 1721.908717 | 208.898652 | 207.614192 | 213.727273 | 26360.94318 | 1.088182e+07 | 359
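
A table of this kind can be produced by counting how often each full row occurs and keeping the rows that occur more than once. The sketch below uses only the standard library. The function name and toy rows are hypothetical, and this is not necessarily how the profiling library computes the table.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Return (row, count) for rows occurring more than once, most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    dupes = [(row, c) for row, c in counts.items() if c > 1]
    dupes.sort(key=lambda item: item[1], reverse=True)
    return dupes[:top]


# Hypothetical two-column dataset: one row appears 3 times, another twice.
rows = [
    (1.0, 2.0), (1.0, 2.0), (1.0, 2.0),
    (3.0, 4.0), (3.0, 4.0),
    (5.0, 6.0),
]
result = most_frequent_duplicates(rows)
```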